Overview

Dataset statistics

Number of variables: 7
Number of observations: 1982
Missing cells: 11
Missing cells (%): 0.1%
Duplicate rows: 38
Duplicate rows (%): 1.9%
Total size in memory: 108.5 KiB
Average record size in memory: 56.1 B
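The percentages and the per-record size above can be re-derived from the raw counts. The following is a minimal Python sketch that uses only values copied from the table. It assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. Both assumptions are consistent with the reported figures but are not stated in the report itself.

```python
# Counts copied from the dataset statistics above.
n_variables = 7
n_observations = 1982
missing_cells = 11
duplicate_rows = 38
total_size_kib = 108.5

# ASSUMPTION: the denominator for missing cells is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# ASSUMPTION: 1 KiB = 1024 bytes.
bytes_per_record = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {bytes_per_record:.1f} B")
```

The three printed values round to the same figures the report shows, which suggests the assumptions match how the report was computed.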

Variable types

Numeric: 6
DateTime: 1

Alerts

Duplicates: Dataset has 38 (1.9%) duplicate rows
Skewed: sales is highly skewed (γ1 = 43.05439499)
Zeros: sales has 549 (27.7%) zeros

Reproduction

Analysis started: 2024-03-20 18:19:40.214139
Analysis finished: 2024-03-20 18:24:10.244842
Duration: 4 minutes and 30.03 seconds
Software version: ydata-profiling v4.2.0
Download configuration: config.json
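The reported duration follows directly from the two timestamps. As a quick check, here is a small sketch using Python's standard `datetime` module. The format string is an assumption based on how the timestamps are printed above.

```python
from datetime import datetime

# ASSUMPTION: timestamps use this layout, as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-03-20 18:19:40.214139", FMT)
finished = datetime.strptime("2024-03-20 18:24:10.244842", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```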

Variables

store_location_key
Real number (ℝ)

Distinct: 36
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6622.5439
Minimum: 1396
Maximum: 9807
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1396
5th percentile: 1396
Q1: 6973
Median: 7296
Q3: 8142
95th percentile: 9604
Maximum: 9807
Range: 8411
Interquartile range (IQR): 1169

Descriptive statistics

Standard deviation: 2463.5767
Coefficient of variation (CV): 0.37199854
Kurtosis: 0.32819427
Mean: 6622.5439
Median Absolute Deviation (MAD): 846
Skewness: -1.2252283
Sum: 13125882
Variance: 6069210.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
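Several of the figures reported for store_location_key are derived from others: the range and IQR come from the quantiles, the variance is the squared standard deviation, and the CV is the standard deviation divided by the mean. The sketch below recomputes them from the published values only, so small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the store_location_key statistics above.
mean, std = 6622.5439, 2463.5767
q1, q3 = 6973, 8142
minimum, maximum = 1396, 9807

value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)
variance = std ** 2               # reported as Variance
cv = std / mean                   # reported as Coefficient of variation (CV)

print(value_range, iqr, round(variance, 1), round(cv, 6))
```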
Value              Count  Frequency (%)
8142               526    26.5%
6973               321    16.2%
7296               277    14.0%
1396               249    12.6%
9604               141    7.1%
4823               95     4.8%
7104               65     3.3%
1891               58     2.9%
9807               52     2.6%
7167               38     1.9%
Other values (26)  160    8.1%

Value              Count  Frequency (%)
1396               249    12.6%
1842               4      0.2%
1891               58     2.9%
2063               6      0.3%
2428               7      0.4%
4823               95     4.8%
4861               1      0.1%
6905               14     0.7%
6941               2      0.1%
6946               8      0.4%

Value              Count  Frequency (%)
9807               52     2.6%
9802               3      0.2%
9604               141    7.1%
8207               1      0.1%
8187               4      0.2%
8161               2      0.1%
8142               526    26.5%
8110               1      0.1%
7317               25     1.3%
7313               2      0.1%

product_key
Real number (ℝ)

Distinct: 811
Distinct (%): 40.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.9398553 × 10^14
Minimum: 8.1000023 × 10^8
Maximum: 1 × 10^15
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 8.1000023 × 10^8
5th percentile: 5.557742 × 10^9
Q1: 7.0942302 × 10^9
Median: 8.8358363 × 10^10
Q3: 1 × 10^15
95th percentile: 1 × 10^15
Maximum: 1 × 10^15
Range: 9.9999919 × 10^14
Interquartile range (IQR): 9.9999291 × 10^14

Descriptive statistics

Standard deviation: 5.0005011 × 10^14
Coefficient of variation (CV): 1.0122769
Kurtosis: -2.001431
Mean: 4.9398553 × 10^14
Median Absolute Deviation (MAD): 8.7437863 × 10^10
Skewness: 0.024236363
Sum: 9.7907931 × 10^17
Variance: 2.5005012 × 10^29
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1 × 10^15            521    26.3%
1 × 10^15            285    14.4%
1 × 10^15            142    7.2%
7.710588004 × 10^10  21     1.1%
1 × 10^15            18     0.9%
8.04906003 × 10^10   8      0.4%
7.710588004 × 10^10  7      0.4%
4.160000017 × 10^10  7      0.4%
4.141000022 × 10^10  6      0.3%
7.710589 × 10^10     5      0.3%
Other values (801)   962    48.5%

Value                Count  Frequency (%)
810000231            1      0.1%
810000394            1      0.1%
810000413            1      0.1%
912848856            1      0.1%
928151100            1      0.1%
1125000024           1      0.1%
1150900378           1      0.1%
1150901811           1      0.1%
1204403889           1      0.1%
1660000087           1      0.1%

Value                Count  Frequency (%)
1 × 10^15            285    14.4%
1 × 10^15            1      0.1%
1 × 10^15            521    26.3%
1 × 10^15            5      0.3%
1 × 10^15            18     0.9%
1 × 10^15            142    7.2%
1 × 10^15            3      0.2%
1 × 10^15            2      0.1%
1 × 10^15            1      0.1%
1 × 10^15            1      0.1%

collector_key
Real number (ℝ)

Distinct: 356
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.736261 × 10^10
Minimum: -1
Maximum: 1.4785035 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 1589
Negative (%): 80.2%
Memory size: 15.6 KiB

Quantile statistics

Minimum: -1
5th percentile: -1
Q1: -1
Median: -1
Q3: -1
95th percentile: 1.4113287 × 10^11
Maximum: 1.4785035 × 10^11
Range: 1.4785035 × 10^11
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 5.5049437 × 10^10
Coefficient of variation (CV): 2.0118489
Kurtosis: 0.30419905
Mean: 2.736261 × 10^10
Median Absolute Deviation (MAD): 0
Skewness: 1.5167011
Sum: 5.4232692 × 10^13
Variance: 3.0304405 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
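Because 1589 of the 1982 collector_key values equal -1, the reported mean and quantiles mostly describe that single value. The sketch below is a back-of-the-envelope calculation from the published figures only. It assumes that -1 is a placeholder for "no collector" rather than a real key, which the report does not state.

```python
# Figures copied from the collector_key statistics above.
n_rows = 1982
n_placeholder = 1589          # rows equal to -1 (all of the negative values)
reported_sum = 5.4232692e13   # reported Sum, placeholders included

# ASSUMPTION: -1 marks a missing collector and is not a genuine key.
n_real = n_rows - n_placeholder
sum_real = reported_sum + n_placeholder   # undo the -1 added by each placeholder row
mean_real = sum_real / n_real

print(f"{n_real} rows carry a real key; their mean is about {mean_real:.4e}")
```

Under that assumption the mean of the remaining keys falls between the smallest and largest non-placeholder values listed for this variable, far from the reported overall mean of 2.736261 × 10^10.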
Value                Count  Frequency (%)
-1                   1589   80.2%
1.345155956 × 10^11  5      0.3%
1.3947 × 10^11       4      0.2%
1.37314 × 10^11      3      0.2%
1.39447 × 10^11      3      0.2%
1.37274 × 10^11      2      0.1%
1.372744939 × 10^11  2      0.1%
1.373212129 × 10^11  2      0.1%
1.34527 × 10^11      2      0.1%
1.39481 × 10^11      2      0.1%
Other values (346)   368    18.6%

Value                Count  Frequency (%)
-1                   1589   80.2%
1.34401893 × 10^11   1      0.1%
1.34405 × 10^11      1      0.1%
1.344085042 × 10^11  1      0.1%
1.3440982 × 10^11    1      0.1%
1.34410033 × 10^11   1      0.1%
1.344130483 × 10^11  1      0.1%
1.34415 × 10^11      1      0.1%
1.344522226 × 10^11  1      0.1%
1.344524125 × 10^11  1      0.1%

Value                Count  Frequency (%)
1.478503527 × 10^11  1      0.1%
1.478414164 × 10^11  1      0.1%
1.47840146 × 10^11   1      0.1%
1.428152676 × 10^11  1      0.1%
1.428146626 × 10^11  1      0.1%
1.428140354 × 10^11  2      0.1%
1.428111318 × 10^11  1      0.1%
1.42811 × 10^11      1      0.1%
1.428106383 × 10^11  1      0.1%
1.428071675 × 10^11  1      0.1%

trans_dt
DateTime

Distinct: 331
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
Minimum: 2015-03-31 00:00:00
Maximum: 2016-10-21 00:00:00
Histogram with fixed size bins (bins=50)

sales
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 448
Distinct (%): 22.7%
Missing: 7
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.145732
Minimum: -62.3
Maximum: 10299.58
Zeros: 549
Zeros (%): 27.7%
Negative: 6
Negative (%): 0.3%
Memory size: 15.6 KiB

Quantile statistics

Minimum: -62.3
5th percentile: 0
Q1: 0
Median: 7.08
Q3: 17.155
95th percentile: 55.158
Maximum: 10299.58
Range: 10361.88
Interquartile range (IQR): 17.155

Descriptive statistics

Standard deviation: 233.93189
Coefficient of variation (CV): 11.611983
Kurtosis: 1891.5499
Mean: 20.145732
Median Absolute Deviation (MAD): 7.08
Skewness: 43.054395
Sum: 39787.82
Variance: 54724.127
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
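The very large skewness reported for sales is what one would expect when a single extreme value (here the maximum of 10299.58) sits far above an otherwise small-valued column. The sketch below illustrates the effect with the adjusted Fisher-Pearson sample skewness, which is one common estimator. Whether the report uses exactly this estimator is an assumption, and the short list of values is hypothetical, chosen only to resemble the column.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (uses the n-1 standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    cubes = sum(((v - mean) / s) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubes

# HYPOTHETICAL data: a few small sales values, then the same list plus one outlier.
typical = [0.0, 0.0, 3.19, 7.32, 8.9, 17.16, 20.45]
with_outlier = typical + [10299.58]

print(f"without outlier: {sample_skewness(typical):.2f}")
print(f"with outlier:    {sample_skewness(with_outlier):.2f}")
```

A single extreme row is enough to push the statistic close to its upper limit for the sample size, so the alert is better read as "inspect the maximum" than as a description of typical sales.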
Value               Count  Frequency (%)
0                   549    27.7%
7.32                54     2.7%
3.19                30     1.5%
8.9                 30     1.5%
5.32                26     1.3%
1.78                26     1.3%
20.45               24     1.2%
4.97                22     1.1%
9.77                22     1.1%
8.88                22     1.1%
Other values (438)  1170   59.0%

Value               Count  Frequency (%)
-62.3               1      0.1%
-55.14              1      0.1%
-40.92              1      0.1%
-39.39              1      0.1%
-35.58              1      0.1%
-9.49               1      0.1%
0                   549    27.7%
0.02                2      0.1%
0.09                6      0.3%
0.18                1      0.1%

Value               Count  Frequency (%)
10299.58            1      0.1%
748.99              1      0.1%
424.19              1      0.1%
389.27              1      0.1%
381.17              1      0.1%
372.45              1      0.1%
332.09              1      0.1%
268.83              1      0.1%
251.89              1      0.1%
240.03              1      0.1%

units
Real number (ℝ)

Distinct: 11
Distinct (%): 0.6%
Missing: 4
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1314459
Minimum: -2
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Negative: 17
Negative (%): 0.9%
Memory size: 15.6 KiB

Quantile statistics

Minimum: -2
5th percentile: 1
Q1: 1
Median: 1
Q3: 1
95th percentile: 2
Maximum: 18
Range: 20
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.75462629
Coefficient of variation (CV): 0.66695746
Kurtosis: 149.25188
Mean: 1.1314459
Median Absolute Deviation (MAD): 0
Skewness: 9.0107175
Sum: 2238
Variance: 0.56946083
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value      Count  Frequency (%)
1          1801   90.9%
2          106    5.3%
3          22     1.1%
4          16     0.8%
-1         15     0.8%
7          7      0.4%
5          5      0.3%
-2         2      0.1%
8          2      0.1%
6          1      0.1%
(Missing)  4      0.2%

Value      Count  Frequency (%)
-2         2      0.1%
-1         15     0.8%
1          1801   90.9%
2          106    5.3%
3          22     1.1%
4          16     0.8%
5          5      0.3%
6          1      0.1%
7          7      0.4%
8          2      0.1%

Value      Count  Frequency (%)
18         1      0.1%
8          2      0.1%
7          7      0.4%
6          1      0.1%
5          5      0.3%
4          16     0.8%
3          22     1.1%
2          106    5.3%
1          1801   90.9%
-1         15     0.8%

trans_key
Real number (ℝ)

Distinct: 1927
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4775147 × 10^25
Minimum: 9.6072966 × 10^21
Maximum: 9.3065169 × 10^26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 9.6072966 × 10^21
5th percentile: 6.628896 × 10^23
Q1: 3.6672232 × 10^24
Median: 3.0786285 × 10^25
Q3: 6.1049635 × 10^25
95th percentile: 1.7641637 × 10^26
Maximum: 9.3065169 × 10^26
Range: 9.3064208 × 10^26
Interquartile range (IQR): 5.7382412 × 10^25

Descriptive statistics

Standard deviation: 1.0714889 × 10^26
Coefficient of variation (CV): 1.9561589
Kurtosis: 27.81512
Mean: 5.4775147 × 10^25
Median Absolute Deviation (MAD): 2.743367 × 10^25
Skewness: 4.7081981
Sum: 1.0856434 × 10^29
Variance: 1.1480885 × 10^52
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
6.750796041 × 10^24  4      0.2%
6.836096041 × 10^23  3      0.2%
6.869096041 × 10^24  3      0.2%
6.723496041 × 10^24  3      0.2%
6.130311396 × 10^25  2      0.1%
6.535696041 × 10^24  2      0.1%
3.18016 × 10^24      2      0.1%
1.715488142 × 10^26  2      0.1%
6.736296041 × 10^24  2      0.1%
3.161516973 × 10^25  2      0.1%
Other values (1917)  1957   98.7%

Value                Count  Frequency (%)
9.607296589 × 10^21  1      0.1%
1.957167118 × 10^22  1      0.1%
2.227296589 × 10^22  1      0.1%
2.527167118 × 10^22  1      0.1%
2.667167118 × 10^22  1      0.1%
3.303729659 × 10^22  1      0.1%
3.707167118 × 10^22  1      0.1%
3.80717 × 10^22      1      0.1%
4.00717 × 10^22      1      0.1%
4.10771756 × 10^22   1      0.1%

Value                Count  Frequency (%)
9.306516905 × 10^26  1      0.1%
9.280676905 × 10^26  1      0.1%
9.224066905 × 10^26  1      0.1%
9.199276905 × 10^26  1      0.1%
9.191826905 × 10^26  1      0.1%
9.173326905 × 10^26  1      0.1%
9.129276905 × 10^26  1      0.1%
7.955687226 × 10^26  1      0.1%
7.860737226 × 10^26  1      0.1%
7.818451891 × 10^26  1      0.1%

Interactions


Correlations

                    store_location_key  product_key  collector_key   sales   units  trans_key
store_location_key               1.000        0.047         -0.159  -0.006  -0.044     -0.210
product_key                      0.047        1.000         -0.071  -0.048  -0.045      0.272
collector_key                   -0.159       -0.071          1.000   0.112   0.023      0.061
sales                           -0.006       -0.048          0.112   1.000   0.089     -0.070
units                           -0.044       -0.045          0.023   0.089   1.000      0.005
trans_key                       -0.210        0.272          0.061  -0.070   0.005      1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
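Nullity correlation can be illustrated by correlating 0/1 "is missing" indicators for two columns. The following is a minimal sketch with a hand-written Pearson correlation. The two short columns are hypothetical and are not taken from the profiled dataset, and the report's own heatmap may be computed differently.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# HYPOTHETICAL columns; None marks a missing cell.
sales = [7.32, None, 0.0, None, 3.19, 8.9]
units = [1, None, 1, 1, None, 2]

# Turn each column into a missingness indicator (1 = missing, 0 = present).
sales_missing = [int(v is None) for v in sales]
units_missing = [int(v is None) for v in units]

nullity_corr = pearson(sales_missing, units_missing)
print(f"nullity correlation: {nullity_corr:.2f}")
```

A value near 1 would mean the two columns tend to be missing in the same rows, a value near -1 that one is missing when the other is present, and a value near 0 that their gaps are unrelated.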

Sample

      store_location_key  product_key      collector_key  trans_dt    sales  units  trans_key
0     9604                999999999999513  -1             2015-10-22  0.00   1      651769604118220151022933
1     1396                999999999999134  -1             2015-11-23  7.10   1      62432813961182201511231134
2     4823                6081506203       134585287507   2015-04-15  25.79  1      2841948231182201504151317
3     1396                77105880035      -1             2015-11-09  1.76   1      62291113961182201511091111
4     9604                999999999999513  -1             2015-12-30  0.00   1      680199604118220151230855
5     9807                77105810740      -1             2015-09-29  15.11  1      2056098075671201509291354
6     1396                8390000626       -1             2015-05-21  4.97   1      60529413961182201505211843
7     9604                999999999999513  -1             2015-12-18  0.00   1      6765796041182201512181212
8     4823                1660000087       -1             2015-09-17  1.58   1      3504948231182201509171216
9     1396                999999999999513  -1             2015-11-02  7.32   1      62221313961182201511021422

      store_location_key  product_key      collector_key  trans_dt    sales  units  trans_key
1972  7104                5847810237       -1             2015-11-13  74.74  1      10928371041182201511131741
1973  7296                6334820448       137262865317   2015-10-16  5.86   1      55923729622282201510162142
1974  7296                6230070609       -1             2015-11-08  12.23  3      60641729623570201511081447
1975  7296                999999999999200  139683045874   2015-12-08  14.24  2      66082729622282201512081502
1976  7296                6340002130       -1             2015-11-14  5.32   1      61759729622967201511141724
1977  7296                999999999999142  -1             2015-12-26  0.00   1      69280729623570201512261217
1978  7104                4155405415       -1             2015-05-23  20.45  1      394837710421852201505231941
1979  7104                3607342350960    141339411236   2015-10-20  14.76  1      29907710422739201510201425
1980  7296                3300000200       140195131255   2016-01-08  20.45  1      71511729622282201601081509
1981  7104                6800079282       134546809449   2015-07-21  3.19   1      408498710422755201507211639

Duplicate rows

Most frequently occurring

      store_location_key  product_key      collector_key  trans_dt    sales  units  trans_key                   # duplicates
25    9604                999999999999513  -1             2015-12-15  0.00   1      6750796041182201512151016   4
22    9604                999999999999513  -1             2015-12-08  0.00   1      6723496041182201512081113   3
31    9604                999999999999513  -1             2016-01-08  0.00   1      683609604118220160108718    3
33    9604                999999999999513  -1             2016-01-18  0.00   1      6869096041182201601181002   3
0     1396                999999999999513  -1             2015-11-27  0.00   1      62484013961182201511271124  2
1     1396                999999999999513  -1             2015-11-30  7.32   1      62505713961182201511301012  2
2     4823                999999999999513  -1             2015-10-23  3.54   1      3658948231182201510231153   2
3     4823                999999999999513  -1             2015-12-02  0.00   1      3826648231182201512021252   2
4     6973                1E+15            1.39481E+11    12/23/2015  22.73  1      3.18016E+24                 2
5     6973                999999999999513  -1             2015-11-06  0.00   1      31615169731182201511061035  2
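A "# duplicates" column like the one above can be produced by counting how often each complete row occurs. The sketch below does this with `collections.Counter` on a handful of hypothetical rows that use only four of the columns. It reports both the number of distinct duplicated rows and the number of surplus copies, because the report does not say which of the two its headline figure of 38 refers to.

```python
from collections import Counter

# HYPOTHETICAL rows: (store_location_key, trans_dt, sales, units).
rows = [
    (9604, "2015-12-15", 0.00, 1),
    (9604, "2015-12-15", 0.00, 1),
    (1396, "2015-11-30", 7.32, 1),
    (9604, "2015-12-15", 0.00, 1),
    (1396, "2015-11-30", 7.32, 1),
    (4823, "2015-04-15", 25.79, 1),
]

counts = Counter(rows)                                   # occurrences per identical row
duplicated = {row: n for row, n in counts.items() if n > 1}
surplus_copies = sum(n - 1 for n in duplicated.values())  # rows beyond the first copy

# List the most frequent duplicated rows first, as in the table above.
for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
print(len(duplicated), "distinct duplicated rows;", surplus_copies, "surplus copies")
```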